MODULAR SYSTEM FOR ACCELERATING DATA SEARCHES
AND DATA STREAM OPERATIONS

TECHNICAL FIELD OF THE INVENTION 

This invention generally relates to integrated circuit computing
devices and to computer system designs. More specifically, it relates
to a combination of memory devices and Field-Programmable Gate Arrays
together forming a Module which can be used to accelerate list
processing functions such as database searches, speech recognition,
speech or text translation, data stream transformation as in video or
image editing, or routing of communications messages.

BACKGROUND OF THE INVENTION

Most computers use a simple architecture of a single memory
subsystem and a single processor or set of processors accessing that
memory. As a result, many systems are unable to perform so-called
data-stream operations efficiently, and are limited by the memory
bandwidth of the system in achieving total performance.

This limits the ability of the conventional computer to handle large
datasets (data streams) at an adequate level of performance. Such
performance limitation prevents deployment of, for example,
speech-to-text and automated translation systems.

To overcome this issue, and to provide the capability that
demanding data-stream operations require of a system, it is necessary
to: (a) increase memory bandwidth, (b) increase compute power by
parallel processing, and (c) define a compute/comparison engine running
at very high speed.

The invention described herein addresses each of these issues and
achieves a dramatic performance boost for these types of operations. It
provides a means to increase memory bandwidth by adding
semi-autonomous Modules, and it adds several layers of parallelism in
computing, data transforming or comparison. The architecture is designed
for the specific set of tasks required but, since it is based on
Reconfigurable Logic, the electronic circuits on which it is based can
be rapidly modified at any given point in time to be optimal in
configuration for the task at hand.

Using a combination of memory devices and Field-Programmable
Gate Arrays (FPGAs), it is possible to build a modular system for
accelerating data searches to much higher levels of performance than can
be realized with a simple computer system. This system achieves
performance improvement in several ways.

First, the use of a combination of memory devices and FPGAs
allows a much higher effective memory access rate than conventional
computer architectures, with total memory bandwidth increasing as each
new module is added.

Second, because the architecture is independent of any computer
structure, the speed of access of each module to its memory component
can be optimized to take advantage of special high-speed memory access
modes such as fast page mode.

Third, the comparisons and other functions take place at hardware
speeds, since the modular architecture described herein does not require
the structure of program steps typically seen in a conventional computer
system.

Fourth, complex comparisons that involve logical or mathematical
transforms of either the Search List data or the Search Target data can
occur in a pipelined stream of hardware operations, permitting very
sophisticated and complex operations which, again, occur at hardware
speeds.

The memory devices and FPGAs that make up a module can be
packaged together in a variety of ways. Packaging choices include placing
the elements on an adapter card that plugs into the computer bus, or into
a special bus dedicated to the search functions. To achieve a dense and
flexible packaging means, the combination of devices that makes up a
module can be packaged onto a SIMM, DIMM or similar plug-in module.
This permits the modules to be packed closely together, and allows the
system designer choices as to whether the module is inserted into the
sockets on the main processor board, or into sockets on a separate
adapter card, where the constraints of the computer memory system can be
ignored.

BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of the present invention and
for further advantages thereof, reference is now made to the following
Description of the Preferred Embodiments taken in conjunction with the
accompanying Drawings in which:

Figure 1 identifies the basic structure of this invention, showing
the connection of the various elements and optional elements and the
function of the interconnections.

Figure 2 is an alternative method of connecting the elements
together.

Figure 3 shows the functional content of the FPGA(s).

Figure 4 identifies the incorporation of a processor or
programmable controller element into the FPGA(s).

Figure 5 demonstrates how parallel function is achieved within the
FPGA(s).

Figure 6 shows the preferred packaging scheme.

Figure 7 shows a scheme for connection of multiple Modules.

Figure 8 shows the connection of multiple Modules to operate on a
large number of characters in parallel.

Figure 9 shows how multiple parallel comparisons are made using
the same data lists.

DESCRIPTION OF THE PREFERRED EMBODIMENT

The most basic embodiment of this invention is shown in Figure 1.
Data and control (such as timing signals and address signals) are
transferred from the computer on the Bus Data and Control lines 1 and
into the FPGA 2 and/or additional optional FPGAs 3.

In the FPGAs 2, 3, the data and control signals are modified to
generate the Modified Data and Control signals 4 which are used to
control the actions and contents of the Memory Devices 6. Such
modifications may include: 1) generating different address values than
the one sent by the computer, 2) generating the required control and
address values to permit reading data from the Memory Devices 5 to
compare with values loaded into the FPGA(s) 2, 3.

There are alternative methods of connecting the Memory Devices
5 to the FPGA(s) 2, 3. Figure 2 shows one such alternative method, where
the same Modified Data and Control 4 are shared by all the FPGAs 2, 3,
as opposed to the method shown in Fig. 1, where different Modified Data
and Control 4, 5 go to each FPGA 2, 3. Such alternative methods are
reconfigurable by connection of different logic in the FPGA(s) 2, 3. This
allows different operations to be performed in the several FPGA(s) 2, 3
in the case of Figure 1, while the method of Figure 2 permits operation
on the same or related data.

In Figure 3, the elements within the FPGA(s) 2, 3 are detailed.
Here is shown how data from the Memory Devices containing Search Lists
6 are moved into and from the FPGA(s) 2, 3 with some combination of
Transforms 7, Math Functions 8 and Comparators 9 being used to modify
and/or examine the data. For clarity, only one of each such Transform 7,
Math Function 8 or Comparator 9 is shown. A typical embodiment
might have several of each, in any order, connected to operate
consecutively on data. The Control Logic 10 manages the sequence of
events inside FPGA(s) 2, 3.
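
As an illustration only, the chaining of Transforms 7 and Math
Functions 8 ahead of a Comparator 9 can be modeled in software. The
Python sketch below is not part of the disclosed hardware. The names
`build_pipeline`, `mask_low_byte`, `add_offset` and `comparator` are
hypothetical, and the particular operations were chosen only to make the
example concrete.

```python
from typing import Callable, Sequence

# A stage takes one data item and returns the modified item.
Stage = Callable[[int], int]


def build_pipeline(stages: Sequence[Stage]) -> Stage:
    """Chain stages so each operates on the output of the previous one."""
    def run(item: int) -> int:
        for stage in stages:
            item = stage(item)
        return item
    return run


def mask_low_byte(item: int) -> int:
    """Hypothetical Transform: keep only the low 8 bits of the item."""
    return item & 0xFF


def add_offset(item: int) -> int:
    """Hypothetical Math Function: add a fixed offset of 1."""
    return item + 1


def comparator(item: int, target: int) -> bool:
    """Comparator: report whether the processed item equals the target."""
    return item == target


# Stages may be placed in any order and quantity, as the text describes.
pipeline = build_pipeline([mask_low_byte, add_offset])
result = pipeline(0x1234)            # low byte 0x34, plus 1
matched = comparator(result, 0x35)
```

In hardware the stages would operate concurrently on successive data
items; the sequential loop here models only the order of operations.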

To effect a typical search, data constituting Search Lists are placed
into the Memory Devices 6. Depending on application of the
embodiment, this might be done by using rapidly reprogrammable
Memory Devices, such as Dynamic Random Access Memory (DRAM) or
Static Random Access Memory (SRAM); semi-static memory devices that
are typically programmed infrequently or only at the time of initial
assembly of the embodiment, such as FLASH memory or Electrically
Erasable Programmable Read-Only Memory (EEPROM); or one-time
programmable Memory Devices such as Mask-Programmable Read-Only
Memory (ROM).

Following the placement of the Search List data, the FPGA(s) 2, 3
are re-programmed from an initial start-up state to be able to manipulate
the Search List data now stored in the Memory Devices 6. Such
manipulations are effected by placing the functional elements, Transforms
7, Math/Logic Functions 8 and Comparators 9, in any sequence or quantity
to act upon selected data elements of the Search List data.

A data item (Search Target) to be compared against the Search List
is placed into FPGAs 2, 3. Data from the Search List are then moved, data
item by data item, into the FPGA(s) 2, 3, where the instantiated
Transforms 7 and Math/Logic Functions 8 operate on said Search List data
item, following which said modified data item is compared with the Search
Target inside Comparator 9. If a match is found between the Search
Target and the Search List data item, the Control Logic 10 then informs
the computer that a match has been found. Said Control Logic may be
programmed to continue on for additional matches to the same Search
Target data, or re-loaded with new Search Target data, and the Search
List and FPGA contents may be changed at any time as required to optimize
performance.
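
The search sequence just described can be sketched in software as
follows. This is a minimal Python model under stated assumptions, not the
hardware implementation: the function name `search`, the use of byte
strings as data items, and the lower-casing transform are all
hypothetical choices made for the example.

```python
from typing import Callable, Iterable, Iterator, List


def search(search_list: Iterable[bytes],
           transform: Callable[[bytes], bytes],
           target: bytes) -> Iterator[int]:
    """Yield the index of each Search List item that matches the target.

    Each item is streamed in, modified by the transform, then compared
    with the Search Target. Because this is a generator, the caller
    decides whether to stop at the first match or continue on for
    additional matches to the same target.
    """
    for index, item in enumerate(search_list):
        if transform(item) == target:
            yield index


# Search List as it might be held in the Memory Devices (assumed data).
memory_devices: List[bytes] = [b"Alpha", b"beta", b"ALPHA", b"gamma"]

# Continue on for all matches to the same Search Target.
matches = list(search(memory_devices, bytes.lower, b"alpha"))

# Stop after the first match is reported.
first_match = next(search(memory_devices, bytes.lower, b"alpha"))
```

Re-loading a new Search Target corresponds to calling `search` again
with a different `target`, while the Search List stays in place.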

Figure 4 extends the concept described above to allow a
programmable controller or processor 11 to be instantiated into the FPGA.
This permits much greater flexibility in operation, since the sequence of
hardware events, and the interaction of the module(s) with a host
computer, are capable of being modified.

Figure 5 shows an extension of the embodiment where multiple
search operations occur in parallel. This is realized by instantiating
sets of the various Transforms 7, Math/Logic Functions 8 and Comparators
9 into FPGA(s) 1 (etc.) and loading either the same or different Search
Target data elements to correspond with each such set, which may contain
different sizes and types of Transforms 7, Functions 8 and Comparators 9.
The operation in such multiple search mode follows the sequence above for
a single search path, with the set of Search Target data items being
compared with either the same Search List data items, as (optionally)
modified by the (possibly different) set of Transforms 7 that are applied
in each search path, or with different Search List data items, similarly
modified.
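
A software model of the multiple search mode may help fix ideas. In the
Python sketch below, each search path carries its own transform and its
own Search Target, and every path is presented with the same Search List
item. The class `SearchPath`, the function `parallel_search` and the
sample data are hypothetical; real hardware would evaluate the paths
simultaneously rather than in a loop.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class SearchPath:
    """One set of transform and target, standing in for one search path."""
    name: str
    transform: Callable[[str], str]
    target: str


def parallel_search(paths: Sequence[SearchPath],
                    search_list: Sequence[str]) -> Dict[str, List[int]]:
    """Present each Search List item to every path and record the hits."""
    hits: Dict[str, List[int]] = {path.name: [] for path in paths}
    for index, item in enumerate(search_list):
        # Hardware would evaluate all paths at once; this loop visits
        # them in turn and yields the same set of matches.
        for path in paths:
            if path.transform(item) == path.target:
                hits[path.name].append(index)
    return hits


paths = [
    SearchPath("case_insensitive", str.lower, "fpga"),
    SearchPath("reversed", lambda s: s[::-1], "MARD"),
]
hits = parallel_search(paths, ["FPGA", "DRAM", "fpga", "SRAM"])
```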

The preferred packaging scheme (Figure 6) for the Modules is the
SIMM. In this means, Memory Devices 6 and FPGAs 1, 2 are mounted
on one of several industry-standard form-factor boards to make a Module.
This permits a very dense package, taking up a small physical space, and
advantageously is supported by many computer systems. Alternative
packaging schemes include the industry-standard PCMCIA bus card, the
DIMM card, the small-footprint PCI card and many other standard form
factors.

In a typical application, several Modules 16 will be mounted
together to achieve modular increments of power. Figure 7 shows such a
configuration. Note that each Module shares the Data and Control signals
to the computer. This permits each Module 16 to be loaded with Search
Data, Search Target and control information, and to communicate with
the computer, while allowing the autonomous parallel operation of the
Modules 16 during the searching or modifying of data.
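
The arrangement of several Modules searching autonomously can be
approximated in software by giving each Module its own partition of the
Search List and running the searches concurrently. The Python sketch
below uses a thread pool purely as a stand-in for independent hardware
Modules; `module_search`, `search_all_modules` and the partition contents
are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple


def module_search(module_id: int,
                  partition: Sequence[int],
                  target: int) -> List[Tuple[int, int]]:
    """Search one Module's own memory; return (module, offset) per match."""
    return [(module_id, offset)
            for offset, item in enumerate(partition)
            if item == target]


def search_all_modules(partitions: Sequence[Sequence[int]],
                       target: int) -> List[Tuple[int, int]]:
    """Load the same target into every Module and collect all matches."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        futures = [pool.submit(module_search, module_id, partition, target)
                   for module_id, partition in enumerate(partitions)]
        results: List[Tuple[int, int]] = []
        # Results are gathered in Module order, so output is deterministic.
        for future in futures:
            results.extend(future.result())
    return results


# Three Modules, each holding part of the Search List (assumed data).
partitions = [[3, 7, 9], [7, 1, 4], [5, 6, 7]]
found = search_all_modules(partitions, 7)
```

Adding a Module in this model means adding a partition, which mirrors
the modular increments of power described above.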

The Modules 16 can also be connected in such a way as to
communicate with each other. This permits comparison of very wide data
elements, which might be useful in image or speech processing, for
example. Figure 8 shows a means where this might be achieved by sharing
the computer Data and Control Bus 1, which is connected to all of the
Modules, as an intercommunication path 12 between each Module.
Determination of the success or otherwise of the search or data
modification operations can be realized by either the computer system or a
specially programmed Module 13.
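
One way to picture the comparison of a very wide data element is to
split it into slices, let each Module compare its own slice, and combine
the partial results. The Python sketch below assumes equal-width slices
and a logical AND as the combining step; both are assumptions made for
the example, as are the names `split_wide` and `wide_match`. The text
does not specify how the partial results are combined.

```python
from typing import List


def split_wide(value: bytes, module_count: int) -> List[bytes]:
    """Split a wide data element into equal slices, one per Module."""
    if len(value) % module_count != 0:
        raise ValueError("value length must be a multiple of module_count")
    width = len(value) // module_count
    return [value[i * width:(i + 1) * width] for i in range(module_count)]


def wide_match(item: bytes, target: bytes, module_count: int) -> bool:
    """Compare a wide item with a wide target, one slice per Module."""
    if len(item) != len(target):
        return False
    item_slices = split_wide(item, module_count)
    target_slices = split_wide(target, module_count)
    # Each Module reports whether its own slice matched.
    partial = [a == b for a, b in zip(item_slices, target_slices)]
    # Combining step, performed by the computer or a dedicated Module:
    # the wide elements match only if every slice matched.
    return all(partial)


same = wide_match(b"ABCDEFGH", b"ABCDEFGH", 4)
differ = wide_match(b"ABCDEFGH", b"ABCDEFGX", 4)
```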

Another method of using the Module architecture, shown in Figure
9, is to build several parallel search or transform paths in each Module.
This can be done within a single FPGA, as shown, or within multiple
FPGAs mounted on the same module and sharing the same data. This
method has the benefit that different transforms, mathematical operations
or comparison methods can be deployed in parallel, to act on the same
data or, if appropriate, different data, as required. This allows, in some
circumstances, for a large multiplication of performance of the modular
system.

1. A data processing module adapted to be connected to a
computer for use with a computer, the computer including a memory for
storing data, the module comprising:

a module memory for storing data; and

a programmable logic device connected to said module memory
and adapted to be connected to the computer for receiving data stored in
said module memory and the computer memory for processing data.

2. The module of Claim 1 wherein said programmable logic
device includes a comparator for determining whether data stored in the
computer memory is stored in said module memory.

3. The module of Claim 1 wherein said programmable logic
device is programmable by data stored in said module memory for
processing data stored in said module memory.

4. The module of Claim 1 wherein said module memory
includes a random access memory device.

5. The module of Claim 1 wherein said module memory and
programmable logic device are mounted on a single in-line memory
module having terminals for connection to the computer.

6. A data processing system for use with a computer, the
computer including a memory for storing data, the system comprising:

a plurality of data processing modules, adapted to be connected to
the computer, each of said modules including:

a module memory for storing data; and
a programmable logic device connected to said module
memory and adapted to be connected to the computer for
receiving data stored in said module memory and the
computer memory; and
such that said plurality of data processing modules simultaneously
process data stored in each of said module memories and the computer
memory.

7. The system of Claim 6 wherein said programmable logic
devices include a comparator for determining whether data stored in the
computer memory is stored in said module memories.

8. The system of Claim 6 and further including:
means for transferring data between said plurality of data
processing modules.

9. The system of Claim 6 wherein ones of said programmable
logic devices perform comparisons on said data stored in said module
memories.